Application Programming Interfaces

Chris Bail
Duke University

What is an Application Programming Interface (API)?

Growth of APIs

A list of APIs

 

There are now nearly 20,000 APIs and counting:

https://www.programmableweb.com/apis/directory

How does an API work?

 

Simple example with Google Maps API

Anatomy of an API Call
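
An API call is usually just a URL made of a base address plus named parameters. As a minimal sketch, the helper below (build_api_call is a hypothetical name, not part of any package) assembles such a URL in base R. The Google Maps geocoding address and the parameter names address and key are assumptions chosen for illustration, and "YOURKEYHERE" is a placeholder.

```r
# Hypothetical helper: assemble an API call from a base URL and named parameters
build_api_call <- function(base_url, params) {
  # Percent-encode each value so spaces and symbols are safe inside a URL
  encoded <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  # Join name=value pairs with "&" and attach them after "?"
  query <- paste(names(params), encoded, sep = "=", collapse = "&")
  paste0(base_url, "?", query)
}

# Assumed endpoint and parameter names, for illustration only
call_url <- build_api_call(
  "https://maps.googleapis.com/maps/api/geocode/json",
  list(address = "Duke University", key = "YOURKEYHERE")
)
print(call_url)
```

Pasting the printed URL into a browser (with a real key) is one way to see the raw output that the API returns.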

Output of API call:

API Credentials

Example: Facebook API

Now YOU try it!!!

1) Take a moment and try to see whether you can make a call for other types of information about yourself, or someone else.

2) What type of data can you get access to?

3) What type of data can you not access?

Rate Limiting
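
Rate limits are typically stated as a number of calls allowed per time window. The arithmetic below is a rough sketch of how to turn such a limit into a pause between calls. All of the numbers are hypothetical and are not the limits of any particular API.

```r
# Hypothetical limits, for illustration only
calls_allowed  <- 180      # requests permitted per window
window_seconds <- 15 * 60  # length of one window, in seconds

# Smallest pause between calls that never exceeds the limit
min_spacing <- window_seconds / calls_allowed
cat("Pause between calls (seconds):", min_spacing, "\n")

# Rough time needed to make one call for each of 500 accounts at that pace
n_accounts <- 500
total_minutes <- n_accounts * min_spacing / 60
cat("Approximate total time (minutes):", round(total_minutes, 1), "\n")
```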

An Example with Twitter's API

Navigate to:

https://apps.twitter.com

Callback URL

Keys and Access Tokens

The rtweet Package

install.packages("rtweet")

Define your Credentials

app_name<-"YOURAPPNAMEHERE"
consumer_key<-"YOURKEYHERE"
consumer_secret<-"YOURSECRETHERE"
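
Hard-coding keys in a script makes them easy to leak when the script is shared. One common alternative, sketched here with base R only, is to read them from environment variables. The variable names and the get_credential helper are hypothetical, and the Sys.setenv call exists only so the sketch runs on its own; normally the values would be set outside the script.

```r
# Environment variable names are hypothetical; choose your own.
# Set placeholder values for this demonstration only. In practice they
# would be defined outside the script (for example in ~/.Renviron).
Sys.setenv(TWITTER_CONSUMER_KEY = "YOURKEYHERE",
           TWITTER_CONSUMER_SECRET = "YOURSECRETHERE")

# Hypothetical helper: fetch a required variable or stop with a clear message
get_credential <- function(var_name) {
  value <- Sys.getenv(var_name)
  if (!nzchar(value)) {
    stop("Environment variable ", var_name, " is not set")
  }
  value
}

consumer_key <- get_credential("TWITTER_CONSUMER_KEY")
consumer_secret <- get_credential("TWITTER_CONSUMER_SECRET")
```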

Authenticate Yourself with Twitter API

library(rtweet)
create_token(app=app_name, consumer_key=consumer_key,
             consumer_secret=consumer_secret, set_renv = TRUE)

Your First API Call

korea_tweets<-search_tweets("#Korea", n=3000, include_rts = FALSE)

Browse the Results

names(korea_tweets)

Browse the Results

head(korea_tweets$text)

Plot the Results

ts_plot(korea_tweets, "3 hours") +
  ggplot2::theme_minimal() +
  ggplot2::theme(plot.title = ggplot2::element_text(face = "bold")) +
  ggplot2::labs(
    x = NULL, y = NULL,
    title = "Frequency of Tweets about Korea from the Past Day",
    subtitle = "Twitter status (tweet) counts aggregated using three-hour intervals",
    caption = "\nSource: Data collected from Twitter's REST API via rtweet"
  )

Next, let's search by Location

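#the language filter belongs inside the query string; a second unnamed
#argument would not be treated as part of the search
nk_tweets <- search_tweets("korea lang:en",
  geocode = lookup_coords("usa"),
  n = 1000, type="recent", include_rts=FALSE
  )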
geocoded <- lat_lng(nk_tweets)

Plot

par(mar = c(0, 0, 0, 0))
maps::map("state", lwd = .25)
with(geocoded, points(lng, lat, pch = 20, cex = .75, col = rgb(0, .3, .7, .75)))

Get Tweets from Individual Account

sanders_tweets <- get_timelines(c("sensanders"), n = 5)
head(sanders_tweets$text)

Get General Information about a User

sanders_twitter_profile <- lookup_users("sensanders")

Browse Fields

sanders_twitter_profile$description

Browse Fields

sanders_twitter_profile$location

Browse Fields

sanders_twitter_profile$followers_count

Get Users' Favorites

sanders_favorites<-get_favorites("sensanders", n=5)
sanders_favorites$text

Get Networks

sanders_follows<-get_followers("sensanders")

Check Rate Limits

rate_limits<-rate_limit()
head(rate_limits[,1:4])

Get Trending Topics by Location

get_trends("New York")

rtweet can even post tweets!

post_tweet("I love APIs")

Now YOU Try it!!!

1) Collect the 100 most recent tweets from CNN.

2) Determine how many people follow CNN on Twitter.

3) Determine whether CNN is currently tweeting about any subjects that are trending in your hometown.

Wrapping API Calls within a Loop

#load list of twitter handles for elected officials
elected_officials<-read.csv("https://cbail.github.io/Elected_Officials_Twitter_Handles.csv",
                            stringsAsFactors = FALSE)
head(elected_officials)
                  name    screen_name
1   Sen Luther Strange SenatorStrange
2    Rep. Mike Johnson RepMikeJohnson
3             Ted Budd     RepTedBudd
4    Adriano Espaillat   RepEspaillat
5 Rep. Blunt Rochester  RepBRochester
6  Nanette D. Barragán    RepBarragan

Wrapping API Calls within a Loop

#create empty container to store tweets for each elected official
elected_official_tweets<-data.frame()

for(i in seq_len(nrow(elected_officials))){

  #first, check rate limits; if no calls remain, wait out the 15-minute window
  rate_limits<-rate_limit()
  limit<-rate_limits[rate_limits$query=="statuses/user_timeline",]
  if(limit$remaining==0){
    Sys.sleep(15*60)
  }

  #pull the 100 most recent tweets for this official
  tweets<-get_timeline(elected_officials$screen_name[i], n=100)

  #append them to the dataframe
  elected_official_tweets<-rbind(elected_official_tweets, tweets)

  #pause for one second to further prevent rate limiting
  Sys.sleep(1)

  #print iteration number for debugging/monitoring progress
  print(i)
}
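
A long loop like the one above stops at the first account that throws an error (for example, a deleted or protected account). One possible variation, sketched here in base R, wraps each call in tryCatch and collects results in a list before binding them once at the end. This swaps the for loop for lapply. The fake_get_timeline function and the "missing_account" handle are hypothetical stand-ins so the sketch runs without touching the Twitter API.

```r
# Hypothetical stand-in for an API call: fails on one handle to mimic
# a deleted or protected account
fake_get_timeline <- function(screen_name) {
  if (screen_name == "missing_account") stop("user not found")
  data.frame(screen_name = screen_name,
             text = paste("tweet from", screen_name),
             stringsAsFactors = FALSE)
}

handles <- c("SenatorStrange", "missing_account", "RepTedBudd")

# Try each handle; on error, report it and return NULL instead of stopping
results <- lapply(handles, function(h) {
  tryCatch(
    fake_get_timeline(h),
    error = function(e) {
      message("Skipping ", h, ": ", conditionMessage(e))
      NULL
    }
  )
})

# Drop the failed calls, then bind everything into one data frame
results <- Filter(Negate(is.null), results)
all_tweets <- do.call(rbind, results)
print(all_tweets)
```

Binding once at the end also avoids repeatedly copying a growing data frame, which can matter when there are hundreds of accounts.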

Facebook's API

Go back to Graph API Explorer

Create Access Token Object

install.packages("Rfacebook")
library(Rfacebook)
token <- "INSERTYOURNUMBERHERE"

Your first Facebook API Call

getUsers("me", token=token)

Retrieve your Likes

my_likes <- getLikes(user="me", token=token)

Working with Public Pages

duke_fb<-getPage("DukeUniv", token=token)

Now YOU try it

1) Find out which organization has more Facebook likes: CNN or the New York Times.

2) Among each organization’s 100 most recent posts, determine which post has received the most “likes.”

There are R packages for other APIs

 

Here are a few: RgoogleMaps, Rfacebook, rOpenSci (this one combines many different APIs, e.g. the Internet Archive), WDI, rOpenGov, rtimes

Many more are available but not yet on CRAN (install them from GitHub using devtools)

There are also APIs that do Analysis for You!

 

For example, visualization (plotly)

Designing your Own API Calls

Challenges of Working with APIs

A list of APIs of interest